Introduction to Cross-Entropy Clustering: The R Package CEC

Authors

  • Jacek Tabor
  • Przemyslaw Spurek
  • Konrad Kamieniecki
  • Marek Smieja
  • Krzysztof Misztal

Abstract

The R package CEC (Kamieniecki and Spurek, 2014) performs clustering based on the cross-entropy clustering (CEC) method, which was recently developed using information theory. The main advantage of CEC is that it combines the speed and simplicity of k-means with the ability to fit various Gaussian mixture models and to remove unnecessary clusters. In this work we present a practical tutorial on CEC based on the R package CEC. The package provides functions that cover the whole clustering process.
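As a rough illustration of why CEC can discard unnecessary clusters, the sketch below evaluates the CEC cost of a fixed partition of 2-D points, assuming the general (full-covariance) Gaussian family. Each cluster with weight p_i contributes p_i * (-ln p_i + N/2 * ln(2*pi*e) + 1/2 * ln det Sigma_i). The function names `cluster_cost` and `cec_cost` are hypothetical, the code is plain Python rather than the R package's API, and it only scores a partition; it does not perform the iterative reassignment of points.

```python
import math

def cluster_cost(points, total):
    """Cost contribution of one cluster of 2-D points (full-covariance Gaussian).

    Assumes the cluster is non-degenerate, i.e. its covariance determinant is > 0.
    """
    n = len(points)
    p = n / total  # cluster weight: fraction of all points in this cluster
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Maximum-likelihood covariance entries (divide by n, not n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    det = sxx * syy - sxy ** 2
    # Cross-entropy of the cluster w.r.t. its best-fitting Gaussian; N/2 = 1 for N = 2.
    cross_entropy = math.log(2 * math.pi * math.e) + 0.5 * math.log(det)
    return p * (-math.log(p) + cross_entropy)

def cec_cost(clusters):
    """Total CEC cost of a partition given as a list of point lists."""
    total = sum(len(c) for c in clusters)
    return sum(cluster_cost(c, total) for c in clusters)

# Two well-separated square blobs (made-up data for illustration only).
blob_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
blob_b = [(10, 10), (11, 10), (10, 11), (11, 11)]

split_cost = cec_cost([blob_a, blob_b])    # two clusters
merged_cost = cec_cost([blob_a + blob_b])  # one cluster
print(split_cost < merged_cost)
```

For this data the two-cluster partition has the lower cost. The -ln p_i term charges for every cluster that is kept, so splitting only pays off when it reduces the covariance terms enough. This is the mechanism behind the cluster reduction mentioned in the abstract.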

Similar resources

Cross-entropy clustering

We build a general and highly applicable clustering theory, which we call cross-entropy clustering (CEC for short), that combines the advantages of classical k-means (easy implementation and speed) with those of EM (affine invariance and the ability to adapt to clusters of desired shapes). Moreover, contrary to k-means and EM, CEC finds the optimal number of clusters by automatically removing groups which ca...

Cross-Entropy Clustering Approach to One-Class Classification

Cross-entropy clustering (CEC) is a density-model-based clustering algorithm. In this paper we present a possible application of CEC to one-class classification, which has several advantages over classical approaches based on Expectation Maximization (EM) and Support Vector Machines (SVM). More precisely, we can use various types of Gaussian models with lower computational complexity. We tes...

Semi-supervised cross-entropy clustering with information bottleneck constraint

In this paper, we propose a semi-supervised clustering method, CECIB, that models data with a set of Gaussian distributions and that retrieves clusters based on a partial labeling provided by the user (partition-level side information). By combining the ideas from cross-entropy clustering (CEC) with those from the information bottleneck method (IB), our method trades between three conflicting g...

Entropy-based Consensus for Distributed Data Clustering

The increasingly large scale of available data and the growing concerns about its privacy are among the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...

A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset

Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that algorithms which can effectively handle the nonstationarity and uncertainty of the signals should be used for ECG analysis. Ex...

Journal:
  • CoRR

Volume: abs/1508.04559

Publication date: 2015